Skip to content

feat: implement bulk delete and undelete#18

Merged
mweidner037 merged 10 commits intomweidner037:masterfrom
EastSun5566:feat/try-bulk-delete
Dec 9, 2025
Merged

feat: implement bulk delete and undelete#18
mweidner037 merged 10 commits intomweidner037:masterfrom
EastSun5566:feat/try-bulk-delete

Conversation

@EastSun5566
Copy link
Copy Markdown
Contributor

@EastSun5566 EastSun5566 commented Dec 6, 2025

Hi @mweidner037

Thanks for the awesome blog post. I'm trying to read the source and learn by contributing to it.

This PR closes #14 - it optimizes bulk delete / undelete operations when ids are in the same leaf node. I also added a simple actions to run tests on every PR.

If there are any problems, please let me know!

@EastSun5566 EastSun5566 changed the title feat: implements bulk delete and undelete feat: implement bulk delete and undelete Dec 6, 2025
@mweidner037
Copy link
Copy Markdown
Owner

Thank you @EastSun5566 !

It looks like there is a subtle bug in delete: if the first id to delete is not known, but others are known, none of them will be deleted. This is due to:

if (located === null) return this;

I've pushed a unit test demonstrating the issue. Since deleting unknown ids should be rare, it's okay to skip the fast path in this case and fall back on calling deleteOne.

The other thing I wonder is: how hard would it be to also optimize the case where the deleted ids are spread across several leaf nodes? Something like: bulk-delete as many as you can in the first leaf, then if you're not done, call locate again on the next id to delete, etc. It's okay to skip this fast path entirely if you encounter an unknown id.

@EastSun5566
Copy link
Copy Markdown
Contributor Author

EastSun5566 commented Dec 8, 2025

Oh, you're right! First, I implemented a quick fix if first id to delete is not known should fall back to deleting one by one:

if (located !== null) {
 // bulk delete
}

// fall back to delete one by one

Something like: bulk-delete as many as you can in the first leaf, then if you're not done, call locate again on the next id to delete, etc. It's okay to skip this fast path entirely if you encounter an unknown id.

Yeah, that makes total sense. Here's my approach:

I track currentId and ⁠remaining in a loop. For each iteration, I calculate how many ids can be deleted in the current leaf, then bulk delete them. If bulk deletion isn't possible, I fall back to deleting one by one. Then I move to the next leaf and repeat until ⁠remaining reaches 0:

while (remaining > 0) {
  const located = ans.locate(currentId);
  
  if (located === null) {
    // skip and move to next
    continue;
  }

  // calculate how many ids can be deleted in this leaf
  if (canDeleteInLeaf) {
    // bulk delete and move to next
  } else {
    // fallback: delete one by one
  } 
}

I also added two simple tests that demonstrate bulk delete / undelete across multiple leaves. All existing tests pass.
What do you think?

src/id_list.ts Outdated
Comment on lines +595 to +599
if (
leaf.bunchId === currentId.bunchId &&
currentId.counter >= leaf.startCounter &&
canDeleteInLeaf > 0
) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition should always be true, since the locate result always contains currentId. So you can actually remove the else clause and deleteOne entirely (nice!).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix it! I remove the whole if statement.

src/id_list.ts Outdated
Comment on lines +690 to +694
if (
leaf.bunchId === currentId.bunchId &&
currentId.counter >= leaf.startCounter &&
canUndeleteInLeaf > 0
) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as for delete (& undeleteOne can be deleted).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also remove it.

src/id_list.ts Outdated
Comment on lines +577 to +578
let remaining = count;
let currentId = id;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if instead of these two variables, the loop just uses a currentCounter variable that holds currentId's counter. Compare that to id.counter + count in the loop condition.

Your code works as well, this is just a style preference.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! This is much cleaner. I refactored it to use the currentCounter check condition.

@mweidner037 mweidner037 merged commit 81a34f0 into mweidner037:master Dec 9, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize bulk delete

2 participants